Abstract. Every weighted tree corresponds naturally to a cooperative game that we call a tree 
game; it assigns to each subset of leaves the sum of the weights of the minimal subtree spanned 
by those leaves. In the context of phylogenetic trees, the leaves are species and this assignment 
captures the diversity present in the coalition of species considered. We consider the Shapley value 
of tree games and suggest a biological interpretation. We determine the linear transformation M 
that shows the dependence of the Shapley value on the edge weights of the tree, and we also 
compute a null space basis of M. Both depend on the split counts of the tree. Finally, we 
characterize the Shapley value on tree games by four axioms, a counterpart to Shapley's original 
theorem on the larger class of cooperative games. We also include a brief discussion of the core of 
tree games. 



1. Introduction 

The Shapley value is arguably the most important solution concept for n-player cooperative 
games. Given a set of players $N$ of size $n = |N|$ in a cooperative game $v$, the Shapley value $\varphi(N, v)$ 
is the unique imputation vector that satisfies four "fairness" criteria (the Shapley axioms) that we 
shall discuss later. In this paper we consider the game $v_T$ induced by an unrooted n-leaf tree $T$ in 
which each edge is assigned a positive number called an edge weight. In this context, the players 
are represented by the leaves of the tree and the value of any coalition $S$ is the total weight of the 
subtree spanned by the members of $S$. 

In a more applied context, we consider games induced by a phylogenetic tree in which players are 
species and the tree represents a proposed evolutionary relationship among the species. We suggest 
that a biological interpretation for the Shapley value is a notion of the average marginal diversity 
that a species brings to any group, and we study how the Shapley value depends on the edge weights 
and topology of the tree. 

One possible application of the Shapley value of a phylogenetic tree is the economic theory of 
biodiversity preservation [TUMTT]. In such contexts, quantifying the biological diversity of a species 
or a group of species is of great interest; many measures have been proposed (see, e.g., [3l [H [13]) . 
The Noah's ark problem [181 [5] asks how to prioritize species in a population if only some limited 
number can be saved; we suggest that the Shapley value provides a natural ranking criterion, as it 
measures the contribution each species brings to the diversity of a group. 

The literature applying game-theoretic solution concepts to an analysis of trees appears to be 
limited. One closely related example is Kar [BJ, who studies cost-sharing in a network structure 
and characterizes the Shapley value of the minimum cost spanning tree game of an arbitrary graph. 
Also, [5] as well as [TT] study values for games that arise from a tree structure. However, these three 
works differ from ours because there each node of a graph is considered as a player in the game, 
whereas we specifically study tree games and allow only leaves as players. Day and McMorris [5] 
propose suitable axioms for a consensus rule that will aggregate several phylogenetic trees into one 
consensus tree; this differs from the thrust of our work, which is to consider one tree and explore 
the interpretation and properties of the Shapley value of the associated tree game. 

In the next section we provide a biological interpretation for the Shapley value of phylogenetic 
trees. Then we discuss the mathematics of calculating the Shapley value on tree games, starting 
with some examples on small trees. We determine the linear transformation that shows how the 
Shapley value depends on the edge weights of the tree, and compute a null space basis that shows 
how to vary edge weights without changing the Shapley value. We also explain how these depend 
on the tree topology. We conclude this paper by developing an analogue of Shapley's theorem that 
characterizes the Shapley value on games by four axioms. We show that on the smaller class of tree 
games, the Shapley value is characterized by those four axioms plus an additional axiom. 



2. Phylogenetic Trees and the Shapley Value 

2.1. Phylogenetic trees. Evolutionary relationships between species are frequently represented 
by a phylogenetic tree. Evidence for such relationships can come from a variety of sources, such as 
genomic data or morphological comparisons, and much work has been done to develop methods for 
constructing a phylogenetic tree from such data (for surveys, see Felsenstein [4] and Semple-Steel). 

Phylogenetic trees are usually binary trees in which each internal node represents a bifurcation in 
some characteristic and the leaves are the species for which we have data. Each edge has a weight 
that represents some unit of distance between the nodes at its endpoints (for instance, it could be 
the time between speciation events). Figure 1 gives a small example of what a (rooted) phylogenetic 
tree could look like. However, in this paper we shall not be concerned with the location of the root 
of a tree, so all our trees will be unrooted. 




Figure 1. Example of a phylogenetic tree with species A-E with edge weights labeled. 

Formally, we shall think of a phylogenetic tree $T$ as an unrooted tree with leaf set $N := \{1, \ldots, n\}$ 
(representing the species in the population), edge set $E$, and an edge weight $\alpha_k$ for each edge $k$ 
in $E$. 

2.2. The Shapley value. In cooperative game theory, a cooperative game is a pair $(N, v)$ consisting 
of a set of players $N = \{1, 2, \ldots, n\}$ and a characteristic function $v$ that takes every subset of $N$ (called 
a coalition) to a real number (called the worth of the coalition). The subset consisting of all players 
is called the grand coalition. Formally, if $2^N$ is the set of all subsets of $N$, then $v : 2^N \to \mathbb{R}$. 
For instance, N could be a set of companies and v could describe the profit that each coalition of 
companies could make if the members of that coalition worked together. 

One of the basic questions in cooperative game theory is: if players work together to achieve some 
total worth (in our example, profit), how should players then distribute their worth (profit) among 
themselves? 

As all (Pareto efficient) solution concepts from cooperative game theory do, the value introduced 
by Shapley [15] suggests a "fair" distribution of the total worth of the entire set of players $N$ among 
the members of $N$. Given a cooperative game $(N, v)$, the Shapley value is a vector $\varphi = (\varphi_i)$ defined 
by the formula 

$$\varphi_i(N, v) = \frac{1}{n!} \sum_{S \subseteq N,\; i \in S} (s-1)!\,(n-s)!\,\bigl(v(S) - v(S - i)\bigr) \tag{1}$$

where $s = |S|$ is the size of the coalition $S$ and $n = |N|$ is the total number of players. 

The formula above has a sensible interpretation that suggests a rationale for the Shapley value to 
obtain a "fair" distribution. For a player $i \in N$ and a coalition $S \subseteq N$ that contains $i$, the quantity 
$v(S) - v(S - i)$ describes $i$'s marginal contribution to the worth of $S$. If we choose an ordering 
of the players uniformly at random (from the $n!$ possible orderings) and $i$ appears as the $s$-th player in that ordering, 
then $i$'s marginal contribution will be $v(S) - v(S - i)$ for each ordering in which the members of 
$S - i$ appear before $i$ and the members of $N \setminus S$ appear after $i$. This may happen in $(s-1)!\,(n-s)!$ 
ways. Hence the combinatorial form of (1) reflects the Shapley value's interpretation as the expected 
marginal contribution that $i$ makes. 
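
As an illustration of formula (1), the following minimal sketch evaluates the Shapley value by brute force over all coalitions. The function name `shapley_value`, the dictionary return type, and the toy game `size_squared` are assumptions made for this example and are not part of the paper; exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial

def shapley_value(players, v):
    """Evaluate formula (1) by brute force.

    `v` maps a frozenset of players (a coalition) to its worth.
    """
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = Fraction(0)
        # Enumerate every coalition S that contains i, grouped by its size s.
        for s in range(1, n + 1):
            for rest in combinations(others, s - 1):
                S = frozenset(rest) | {i}
                weight = factorial(s - 1) * factorial(n - s)
                total += weight * Fraction(v(S) - v(S - {i}))
        phi[i] = total / factorial(n)
    return phi

# Hypothetical symmetric game: a coalition of size s is worth s squared.
def size_squared(S):
    return len(S) ** 2

phi = shapley_value(["a", "b", "c"], size_squared)
print(phi)  # each of the three symmetric players receives 3
```

The enumeration visits all $2^{n-1}$ coalitions containing each player, so it is only practical for small $n$.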

2.3. The Phylogenetic Tree Game. Given a phylogenetic tree $T$, we can define an associated 
cooperative game $(N, v_T)$ that we call a phylogenetic tree game. Let $N$ be the set of leaves of the 
tree (species). For any subset $S \subseteq N$ of species, consider the unique minimal subtree spanned by the 
members of $S$, and let $v_T(S)$ be the sum of the edge weights of that subtree. Thus for each 
set $S$ we may think of $v_T(S)$ as a measure of the phylogenetic diversity [3] within $S$. This measure 
and its computational aspects have been studied much in recent years (see e.g., [TBI 171 f!2]). 
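
The definition of the tree game can be sketched in code as follows. The encoding is an assumption of this example: each edge is stored as a pair of its weight and the set of leaves on one side of it, and an edge belongs to the subtree spanned by $S$ exactly when $S$ has leaves on both sides of that edge. The four-leaf tree and its numeric weights are hypothetical.

```python
def tree_game(edges):
    """Return the characteristic function v_T of a weighted tree.

    `edges` is a list of (weight, side) pairs, where `side` is the set of
    leaves on one side of the edge.
    """
    def v(S):
        S = frozenset(S)
        # An edge counts when S has members on both sides of it.
        return sum(w for w, side in edges if S & side and S - side)
    return v

# Hypothetical four-leaf tree: leaf weights 1, 2, 3, 4 for A, B, C, D and
# one internal edge of weight 5 separating {A, B} from {C, D}.
edges = [
    (1, frozenset("A")),
    (2, frozenset("B")),
    (3, frozenset("C")),
    (4, frozenset("D")),
    (5, frozenset("AB")),
]
v = tree_game(edges)
print(v("A"), v("AB"), v("AC"), v("ABCD"))  # 0 3 9 15
```

A single species spans no edge, so its worth is zero, and the internal edge is counted only for coalitions with members on both of its sides.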

Then the pair $(N, v_T)$ naturally forms a cooperative game. Although species can hardly be 
compared with rationally acting agents (as usually assumed in theory of cooperative games), we 
may still ask for a meaningful re-interpretation of game-theoretic solution concepts such as the 
Shapley value in the context of phylogenetic trees. 

Given a phylogenetic tree game $(N, v_T)$, equation (1) suggests that the Shapley value of a given 
species may be thought of as its average marginal diversity; i.e., the average diversity the species 
can be expected to add to a group that it joins. So if $\varphi_i > \varphi_j$, then species $i$ can be thought to 
contribute a greater diversity to a group than species $j$ might. 

Example 1. From direct calculations using (1), the five-leaf tree $T$ in Figure 2 has Shapley value 

$$\varphi = (\varphi_A, \varphi_B, \varphi_C, \varphi_D, \varphi_E) = (5.28, 6.78, 4.2, 4.95, 2.78)$$

as we will show in Section 3.2. 




Figure 2. A five-leaf tree. 



2.4. The Shapley Value Axioms. Besides the interpretation of the Shapley value as an average 
expected marginal contribution, there is an axiomatization of the Shapley value (see [15]) that 
uniquely characterizes it by a set of (desirable) properties. We review the axioms presented by 
Shapley and discuss their plausibility in the present setting as properties of phylogenetic trees. To this end, let 
$\mathcal{V} := \{v : 2^N \to \mathbb{R} \mid v(\emptyset) = 0\}$ be the set of all cooperative games with $n$ players. 

(1) (Pareto Efficiency Axiom) The Shapley value is Pareto efficient, i.e., $\sum_{i \in N} \varphi_i(N, v) = v(N)$ 
for all $v \in \mathcal{V}$. 

This axiom just states that the total diversity present within a phylogenetic tree will be 
distributed and ascribed to the species within it. This is a reasonable axiom, given that 
the purpose of a solution concept for a cooperative game is to distribute the worth of the 
grand coalition among its members. In this context, the natural interpretation is that the 
Shapley value answers the question of how much a specific species is responsible for the total 
diversity, or, put another way, what is its share of $v_T(N)$. 

(2) (Symmetry Axiom) For any permutation of players $\pi : N \to N$ the Shapley value satisfies 
$\varphi(\pi v) = \pi\varphi(v)$, where $\pi v$ is the permuted game given by $\pi v(S) := v(\pi(S))$ for all $S \subseteq N$ 
and $\pi\varphi(v)$ is the permuted solution vector, i.e., $(\pi\varphi(v))_i := \varphi_{\pi(i)}(v)$. 

The symmetry axiom states that a player's allocation should not be based on her name. 
Another consequence of the symmetry axiom is if exchanging two players causes no difference 
in the worth that each adds to any coalition, then they should have the same Shapley value. 
Biologically speaking, if two species play the same role within a tree then they should be 
ascribed the same responsibility for diversity, which seems to be a plausible requirement. 

(3) (Dummy Axiom) A dummy player is one that does not add worth to the value of any 
coalition. This axiom says that dummy players should have a Shapley value of zero. 

This axiom is vacuously satisfied in the case of a phylogenetic tree game because there 
are no dummy species. To see this, note that every species $i$ adds worth to the coalition 
that consists of a single species $j \neq i$, because the weight of the subtree containing $i$ and $j$ 
is the sum of the edge weights between $i$ and $j$ and is therefore non-zero, but the weight of 
the subtree consisting of the singleton $j$ is zero. (Even though there are no dummy species, 
this is still a reasonable axiom here, since any species that does not diversify any coalition 
should get value zero.) 

(4) (Additivity Axiom) Given two games $(N, v)$ and $(N, w)$ in $\mathcal{V}$ with the same set of players $N$, 
define the sum game $(N, v + w)$ with characteristic function $(v + w)(S) = v(S) + w(S)$ for 
every coalition $S$. This axiom stipulates that the Shapley value of the sum game should be 
the sum of the Shapley values of the individual games: $\varphi(N, v + w) = \varphi(N, v) + \varphi(N, w)$. 

As an example, suppose we are given nucleotide sequences for a set of species N, and each 
sequence has length 200. For each pair of species i,j consider the (rather crude) measure 
of distance d(i,j) to be the number of positions in which the sequences differ. The pairwise 
distance data can be used to construct a tree (using any standard method) and consequently, 
a tree game. Thus the first 100 positions of the sequences can be used to construct a tree 
game $(N, v_1)$, and the second 100 positions a tree game $(N, v_2)$. Then the Shapley value 
of the sum game $(N, v_1 + v_2)$ is the sum of the Shapley values for each game. This seems 
plausible in this context, since if the pairwise distances $d(i,j)$ from both sets of 100 positions 
actually arise from tree metrics on the same topological tree, then the sum game will arise 
from the tree reconstructed from all 200 positions. 

3. Examples and Motivation: The Shapley Value for Small Trees 

As can be seen from (1), the Shapley value of a tree game is a linear function of the edge weights 
of the tree. We call that linear transformation the Shapley transformation. Before deriving a general 
formula for this transformation in the subsequent section, we study the Shapley transformation for 
games induced by unrooted three-, four-, five- and six-leaf trees. 

We will refer to the weights of edges incident to leaves as leaf weights and other edge weights as 
internal edge weights. Note that for an unrooted n-leaf tree, there are $n - 2$ internal nodes and $n - 3$ 
internal edges in $E$. In what follows, the superscript $T$ denotes the transpose. 

Definition 2. Let $T$ be an n-leaf tree with leaves $N = \{1, \ldots, n\}$, associated leaf weights $\alpha_1, \ldots, \alpha_n$ 
and internal edges $I_1, \ldots, I_{n-3}$ with associated internal edge weights $\alpha_{I_1}, \ldots, \alpha_{I_{n-3}}$. Let $\vec{E}$ be the vector 
consisting of the edge weights in this order: $(\alpha_1, \ldots, \alpha_n, \alpha_{I_1}, \ldots, \alpha_{I_{n-3}})^T$. Define $M = M(N, v_T)$ to 
be the $n \times (2n - 3)$ matrix that represents the Shapley transformation, so that the Shapley value of 
the game $v_T$ is 

$$\varphi(N, v_T) = (\varphi_1, \varphi_2, \ldots, \varphi_n)^T = M\vec{E}$$

where $\varphi_i$ is the Shapley value associated with leaf $i$. Note that $M$ depends on the topology of the 
n-leaf tree. 

Later in Theorem 4 we determine a formula for $M[i, k]$, which is the coefficient of edge weight $k$ 
in the calculation of the Shapley value of $i$. But first, we give a few examples. 

3.1. Three-Leaf Trees. Topologically, there is only one unrooted three-leaf tree $T$. Let the leaves 
represent players A, B, and C with corresponding leaf weights $\alpha$, $\beta$, and $\gamma$ as seen in Figure 3. 
The characteristic function $v_T$ for this game is 

$$v_T(A) = v_T(B) = v_T(C) = 0,$$
$$v_T(AB) = \alpha + \beta, \quad v_T(AC) = \alpha + \gamma, \quad v_T(BC) = \beta + \gamma,$$
$$v_T(ABC) = \alpha + \beta + \gamma.$$

Footnote (to the Dummy Axiom in Section 2.4): In Section 6 we will replace the dummy axiom by a different one to characterize the Shapley value on the class 
of games that actually come from trees. 

Figure 3. The topology of an unrooted three-leaf tree $T$ where the players are A, 
B, and C with corresponding leaf weights $\alpha$, $\beta$, and $\gamma$. 

Using Definition 2 we can calculate the Shapley value by $\varphi = (\varphi_A, \varphi_B, \varphi_C)^T = M\ell$, where $\ell$ is the 
vector of leaf weights $(\alpha, \beta, \gamma)^T$ and 

$$M = \frac{1}{6} \begin{pmatrix} 4 & 1 & 1 \\ 1 & 4 & 1 \\ 1 & 1 & 4 \end{pmatrix}.$$

It is apparent that we can solve for $\alpha$, $\beta$, and $\gamma$ in terms of $\varphi$ by inverting $M$: 

$$\begin{pmatrix} \alpha \\ \beta \\ \gamma \end{pmatrix} = \frac{1}{3} \begin{pmatrix} 5 & -1 & -1 \\ -1 & 5 & -1 \\ -1 & -1 & 5 \end{pmatrix} \begin{pmatrix} \varphi_A \\ \varphi_B \\ \varphi_C \end{pmatrix}.$$

This means the Shapley value of a 3-leaf tree uniquely determines the tree representing the game. 
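
The invertibility claim can be checked numerically. The sketch below uses exact fractions, takes the inverse of $M$ to be $\tfrac{1}{3}$ times the matrix with 5 on the diagonal and $-1$ elsewhere, and picks hypothetical leaf weights $(1, 2, 3)$. The helper `matvec` is an illustrative name.

```python
from fractions import Fraction as F

# Shapley transformation of the three-leaf tree and its inverse.
M = [[F(4, 6), F(1, 6), F(1, 6)],
     [F(1, 6), F(4, 6), F(1, 6)],
     [F(1, 6), F(1, 6), F(4, 6)]]
M_inv = [[F(5, 3), F(-1, 3), F(-1, 3)],
         [F(-1, 3), F(5, 3), F(-1, 3)],
         [F(-1, 3), F(-1, 3), F(5, 3)]]

def matvec(A, x):
    """Multiply the matrix A (list of rows) by the vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

weights = [1, 2, 3]             # hypothetical leaf weights (alpha, beta, gamma)
phi = matvec(M, weights)        # Shapley value of the tree game
recovered = matvec(M_inv, phi)  # leaf weights recovered from the Shapley value
print(phi, recovered)
```

For these weights the Shapley value is $(3/2, 2, 5/2)$, which sums to the total tree weight 6, and applying the inverse returns the original leaf weights.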



3.2. Four- and Five-Leaf Trees. Similarly, we can calculate the Shapley value for each player in 
the four- and five-leaf cases. There is a unique tree topology for each, shown in Figure 4. 




Figure 4. (left) The topology for an unrooted four-leaf tree where the players are 
A, B, C, and D. (right) The unrooted five-leaf tree with players A, B, C, D, and E. 



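The Shapley value for the general four-leaf tree game, with leaf weights $\alpha, \beta, \gamma, \delta$ for A, B, C, D and internal edge weight $\mu$, is 

$$\begin{pmatrix} \varphi_A \\ \varphi_B \\ \varphi_C \\ \varphi_D \end{pmatrix} = \frac{1}{12} \begin{pmatrix} 9 & 1 & 1 & 1 & 3 \\ 1 & 9 & 1 & 1 & 3 \\ 1 & 1 & 9 & 1 & 3 \\ 1 & 1 & 1 & 9 & 3 \end{pmatrix} \begin{pmatrix} \alpha \\ \beta \\ \gamma \\ \delta \\ \mu \end{pmatrix}.$$

Similarly for the five-leaf tree game, the Shapley value is $\varphi = M\vec{E}$ with 

$$M = \frac{1}{120} \begin{pmatrix} 96 & 6 & 6 & 6 & 6 & 36 & 16 \\ 6 & 96 & 6 & 6 & 6 & 36 & 16 \\ 6 & 6 & 96 & 6 & 6 & 16 & 36 \\ 6 & 6 & 6 & 96 & 6 & 16 & 36 \\ 6 & 6 & 6 & 6 & 96 & 16 & 16 \end{pmatrix},$$

where the rows correspond to A, B, C, D, E, the first five columns to their leaf edges, and the last two columns to the internal edges with weights $\mu$ (separating A, B from C, D, E) and $\rho$ (separating C, D from A, B, E). 

This formula produces the calculation in Example 1. 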

It is apparent from the fact that there are more variables (edge weights) than equations that there 
is not a unique set of (possibly negative) edge weights for a given Shapley value. That is, there is 
not a unique tree corresponding to a given Shapley value. The null space of M will therefore help 
us determine which weighted trees have the same Shapley value. A basis for the null space of M for 
the four-leaf tree is 

$$\left\{ \left(-\tfrac{1}{4},\; -\tfrac{1}{4},\; -\tfrac{1}{4},\; -\tfrac{1}{4},\; 1\right)^T \right\}.$$



This means that given a tree T, we can produce other trees with the same Shapley value by reducing 
the leaf weights by 1/4 for each unit increase in the internal edge weight. 
Similarly, a null space basis for the five-leaf tree is 



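$$\left\{ \left(-\tfrac{1}{3}, -\tfrac{1}{3}, -\tfrac{1}{9}, -\tfrac{1}{9}, -\tfrac{1}{9}, 1, 0\right)^T,\; \left(-\tfrac{1}{9}, -\tfrac{1}{9}, -\tfrac{1}{3}, -\tfrac{1}{3}, -\tfrac{1}{9}, 0, 1\right)^T \right\}.$$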



For example, the first basis element (multiplied by -3) shows us that the tree in Figure 5 has the 
same Shapley value as the tree in Figure 2. 




Figure 5. A five-leaf tree with the same Shapley value as Figure 2. 

Figure 6. (left) The first topology for an unrooted six-leaf tree $T$ where the players 
are A, B, C, D, E and F. (right) The second unrooted six-leaf tree $T'$. 



3.3. Six-Leaf Trees. For our last direct calculation, let us consider the games represented by six-leaf 
trees. In this case there are two topologies for unrooted trees with six leaves (see Figure 6). 
The Shapley values for the first and second six-leaf trees are, respectively, 

As with the four- and five-leaf cases, both topologies of the six-leaf tree allow for many trees to 
possess the same Shapley value. The basis for the null space of the first six-leaf tree is 

and for the second six-leaf tree is 

3.4. Notes on Relationship between Trees and Shapley Values. From these examples, we 
make a few observations. 

(1) Any Shapley value n-vector can be realized by adjusting the edge weights of an n-leaf tree. 
This may involve positive as well as nonpositive edge weights. However, the positive hull 
of the column vectors of the matrix M can be realized as the Shapley value of trees with 
nonnegative edge weights. 

(2) When $n \geq 4$, there is not a unique n-leaf tree corresponding to a given Shapley value because 
the null space is nontrivial. 

(3) The null spaces for the two six-leaf trees are different (since there is exactly one basis for 
each null space whose projection to the last 3 coordinates consists of the standard unit vectors in 
$\mathbb{R}^3$, and these bases are different for the given null spaces). Moreover, one may check that 
there is no permutation of the coordinates of one null space that will identify it with the 
other null space (we say such spaces are not permutation equivalent). As we shall see in 
Section 5, if the null spaces are not permutation equivalent, then the two trees must not be 
isomorphic (Theorem 8). 

(4) Under close inspection, one notices a relationship between the numbers of leaves on each 
side of an internal edge (the split counts) and quantities such as the entries of the Shapley 
transformation matrix and the null space basis vectors. We exhibit their explicit dependence 
in the following sections. 

4. The Shapley Transformation 

We first show the contribution of each edge weight to the Shapley value; these are the entries of 
the matrix M representing the Shapley transformation. The following theorem gives us a quick way 
of finding the (i, k)th entry of M. Before we state and prove the theorem, we need a definition that 
will be instrumental throughout the rest of this paper. 

Definition 3. Let $T$ be an n-leaf tree with leaves $N$ and edges $E$. For $i \in N$ and $k \in E$, the removal 
of edge $k$ splits $T$ into two subtrees. Let $\mathcal{C}(i, k)$ denote the set of leaves in the subtree that contains 
$i$ (the "containing" subtree) and let $\mathcal{F}(i, k)$ denote the set of leaves in the other subtree that is "far" 
from $i$. We then denote the number of leaves of $\mathcal{C}(i, k)$ and $\mathcal{F}(i, k)$ as $c(i, k)$ and $f(i, k)$, respectively. 

If it is obvious what leaf $i$ and edge $k$ we are referring to, we will simply write $c, f$ instead of 
$c(i, k), f(i, k)$. Note that $n = c + f$. We call $c, f$ the split counts associated with leaf $i$ and edge $k$. 
As we shall see, the split counts will arise frequently in our results on the Shapley transformation. 

Theorem 4. Let $T$ be an n-leaf tree. The $(i, k)$th entry of the Shapley transformation matrix $M$ is 
given by 

$$M[i, k] = \frac{1}{n} \cdot \frac{f(i, k)}{c(i, k)}.$$

Proof. Fix leaf $i$. To count the number of times a given edge weight contributes to $i$'s Shapley value, 
we need to know how many times it is in the marginal contribution of $i$ for coalitions of size $s$. Edge 
weight $\alpha_k$ will be part of $i$'s marginal contribution to a coalition of size $s \geq 2$ exactly when the other $s - 1$ members of the coalition are 
all from the far side of the edge from $i$. So 

$$M[i, k] = \frac{1}{n!} \sum_{s=2}^{f+1} (s-1)!\,(n-s)! \binom{f}{s-1} = \frac{f!}{n!} \sum_{s=2}^{f+1} \frac{(n-s)!}{(f - s + 1)!}.$$

Using the fact $f = n - c$, the above expression can be rewritten: 

$$M[i, k] = \frac{(n-c)!\,(c-1)!}{n!} \sum_{s=2}^{n-c+1} \binom{n-s}{c-1}.$$

We use the identity 

$$\sum_{s=2}^{n-c+1} \binom{n-s}{c-1} = \sum_{j=c-1}^{n-2} \binom{j}{c-1} = \binom{n-1}{c}$$

to obtain 

$$M[i, k] = \frac{(n-c)!\,(c-1)!}{n!} \binom{n-1}{c} = \frac{n-c}{nc} = \frac{f}{nc}. \qquad \Box$$



This result is particularly nice because it shows how the Shapley value's dependence on any edge 
weight simply hinges on the number of leaves on either side of that edge. Consider the following 
example. 

Example 5. Using Theorem 4, we will calculate the coefficient of $\mu$ in player A's Shapley value for 
a five-leaf tree. Let the edge with edge weight $\mu$ be $I_1$. There are three leaves in $\mathcal{F}(A, I_1)$ and two 
leaves in $\mathcal{C}(A, I_1)$. Thus, 

$$M[1, 6] = \frac{3}{5 \cdot 2} = \frac{3}{10},$$

which is the same as the $(A, \mu)$ entry $36/120$ in the Shapley transformation of the five-leaf tree given 
in Section 3.2. 
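
The formula of Theorem 4 lends itself to a direct computation of the whole matrix from the split counts. The following is a minimal sketch, assuming each edge is encoded by the set of leaves on one of its sides. The function name `shapley_matrix` is illustrative, and the labelling of the five-leaf tree ($I_1$ separating A, B from C, D, E and $I_2$ separating C, D from A, B, E) is read off from the examples in the text.

```python
from fractions import Fraction

def shapley_matrix(leaves, splits):
    """Build M entry by entry from the split counts: M[i][k] = f / (n c).

    `splits[k]` is the set of leaves on one side of edge k.
    """
    n = len(leaves)
    M = []
    for i in leaves:
        row = []
        for side in splits:
            c = len(side) if i in side else n - len(side)  # leaves on i's side
            f = n - c                                      # leaves on the far side
            row.append(Fraction(f, n * c))
        M.append(row)
    return M

# Five-leaf tree: the five leaf edges first, then I_1 = AB|CDE and I_2 = CD|ABE.
leaves = "ABCDE"
splits = [set(x) for x in ("A", "B", "C", "D", "E", "AB", "CD")]
M = shapley_matrix(leaves, splits)
column_sums = [sum(row[k] for row in M) for k in range(len(splits))]
print(M[0][5])      # coefficient of mu in A's Shapley value: 3/10
print(column_sums)  # every column sums to 1
```

Each column sums to 1 because a unit of weight on any single edge raises the worth of the grand coalition by exactly one unit, which the Shapley value distributes completely.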

5. The Null Space of the Shapley Transformation 

Now we will also use Theorem 4 to understand the dependence of the null space of the Shapley 
transformation on the split counts, as suggested in Section 3.4. 

The following theorem exhibits a null space basis of M in terms of the split counts. 

Theorem 6. Let $T$ be an n-leaf tree with leaves $N = \{1, \ldots, n\}$ and internal edges $I_1, \ldots, I_{n-3}$. The 
dimension of the null space of $M = M(N, v_T)$ is $n - 3$. A basis for the null space is the collection 
of vectors $\{w_{I_k}\}$ in $\mathbb{R}^{2n-3}$, one for each internal edge $I_k$: 

$$(w_{I_k})_i = \begin{cases} -\dfrac{f(i, I_k) - 1}{(n-2)\,c(i, I_k)} & \text{if } 1 \leq i \leq n \\ 1 & \text{if } i = n + k \\ 0 & \text{otherwise} \end{cases} \tag{2}$$

for all $k \in \{1, \ldots, n - 3\}$ and entries $i \in \{1, \ldots, 2n - 3\}$, where the first $n$ entries correspond to 
leaves and the last $n - 3$ entries correspond to internal edges. 

Before proving the theorem, we give an example. 

Example 7. Consider the five-leaf tree in Figure 4. Let $I_1, I_2$ be the internal edges with weights $\mu, \rho$, 
respectively. We use Theorem 6 to determine the null space vector $w_{I_1}$. The $(5 + 1)$-th, i.e., 6th, entry of 
$w_{I_1}$ is 1 and all entries after that are 0. To find the first five entries of the vector, consider the two 
subtrees obtained by removing $I_1$ from the tree, namely, the subtrees AB and CDE. By (2), the first 
two entries, corresponding to A and B, will be 

$$-\frac{3 - 1}{(5 - 2) \cdot 2} = -\frac{1}{3}$$

and the next three entries, corresponding to C, D, and E, are 

$$-\frac{2 - 1}{(5 - 2) \cdot 3} = -\frac{1}{9}.$$

This agrees with the first null space basis vector we exhibited in Section 3.2. (The other basis vector 
there is $w_{I_2}$.) 

Now we prove Theorem 6. 

Proof. Let $T$ be an n-leaf tree. Consider the $i$th leaf. If we let $M$ be the matrix of Shapley value 
coefficients for $T$, then we want to show 

$$\sum_{j=1}^{2n-3} M[i, j]\,(w_{I_k})_j = 0. \tag{3}$$

Fix $k \in \{1, \ldots, n - 3\}$. Note that by Theorem 4, for all leaves $j \neq i$, $M[i, j] = \frac{1}{n(n-1)}$ and 
$M[i, i] = \frac{n-1}{n}$. The only other entry of $M$ we need to consider is the one associated with 
the $(n + k)$-th edge of the tree, i.e., the internal edge $I_k$. The split counts for $I_k$ relative to $i$ are $c, f$ and so 

$$M[i, n + k] = \frac{1}{n} \cdot \frac{f}{c}.$$

Thus, showing (3) is the same as showing that 

$$-f \cdot \frac{1}{n(n-1)} \cdot \frac{c - 1}{(n-2) f} - (c - 1) \cdot \frac{1}{n(n-1)} \cdot \frac{f - 1}{(n-2) c} - \frac{n-1}{n} \cdot \frac{f - 1}{(n-2) c} + \frac{1}{n} \cdot \frac{f}{c} = 0,$$

which can be checked by algebraic manipulation and the fact that $n = f + c$. 

Thus $w_{I_k}$ is in the null space of the Shapley transformation $M$. It is apparent that the null space 
has dimension $n - 3$ and the $w_{I_k}$ are linearly independent. Therefore the $w_{I_k}$ form a basis of the 
null space of $M$. □ 
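
The algebraic manipulation invoked in the proof can be written out as follows. This is one possible route, not necessarily the authors' own: multiply the left-hand side of the identity to be shown by $n(n-1)(n-2)c$ and simplify using $c + f = n$.

```latex
\begin{align*}
& -(c-1)c \;-\; (c-1)(f-1) \;-\; (n-1)^2 (f-1) \;+\; f(n-1)(n-2) \\
&\quad= -(c-1)(c+f-1) \;-\; (n-1)^2 (f-1) \;+\; f(n-1)(n-2) \\
&\quad= (n-1)\bigl[-(c-1) - (n-1)(f-1) + f(n-2)\bigr] \\
&\quad= (n-1)\bigl[\,n - c - f\,\bigr] \;=\; 0 .
\end{align*}
```

The four summands in the first line correspond, in order, to the $f$ leaves on the far side of $I_k$, the $c - 1$ leaves other than $i$ on the near side, the leaf $i$ itself, and the internal edge $I_k$.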

We note that this basis $\{w_{I_k}\}$, $k = 1, \ldots, n - 3$, is uniquely determined by fixing the last $n - 3$ 
coordinates to be the standard basis vectors in $\mathbb{R}^{n-3}$. For this reason we refer to this basis as the 
standard null space basis for $M$. 
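
The standard null space basis can be generated and checked mechanically. The sketch below is an illustration under stated assumptions: internal edges are encoded by the set of leaves on one side, the leaf entries follow the split-count formula $-(f - 1)/((n - 2)c)$ used in Example 7, and the function names are hypothetical. It builds $M$ for the five-leaf tree and confirms that $M w = 0$ for each basis vector.

```python
from fractions import Fraction

def split_counts(i, side, n):
    """Return (c, f): leaves on i's side of the edge and on the far side."""
    c = len(side) if i in side else n - len(side)
    return c, n - c

def shapley_matrix(leaves, internal):
    n = len(leaves)
    splits = [{leaf} for leaf in leaves] + internal  # column order of Definition 2
    return [[Fraction(f, n * c)
             for c, f in (split_counts(i, side, n) for side in splits)]
            for i in leaves]

def standard_null_basis(leaves, internal):
    n = len(leaves)
    basis = []
    for k, side in enumerate(internal):
        w = []
        for i in leaves:
            c, f = split_counts(i, side, n)
            w.append(-Fraction(f - 1, (n - 2) * c))
        # Last n - 3 coordinates: the k-th standard unit vector.
        w += [Fraction(int(j == k)) for j in range(len(internal))]
        basis.append(w)
    return basis

leaves = "ABCDE"
internal = [set("AB"), set("CD")]  # I_1 = AB|CDE, I_2 = CD|ABE
M = shapley_matrix(leaves, internal)
basis = standard_null_basis(leaves, internal)
products = [[sum(m * x for m, x in zip(row, w)) for row in M] for w in basis]
print(basis[0])
print(products)  # all entries are zero
```

The first vector reproduces $(-1/3, -1/3, -1/9, -1/9, -1/9, 1, 0)$, and the two entries equal to $-1/3$ mark the cherry formed by A and B.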

An immediate corollary of Theorem 6 is that the standard basis of $\mathrm{Null}(M)$ reveals which pairs 
of leaves form cherries. A pair of leaves $i, j$ is called a cherry if they have a common parent. This 
is the case if and only if the tree spanned by $i$ and $j$ does not include an internal edge. Therefore, 
removing the internal edge $k$ that contains the common parent will split the tree into a 2-leaf subtree 
and an $(n - 2)$-leaf subtree, and in such a situation (as long as $n > 4$) we expect to find exactly two 
entries in $w_{I_k}$ whose values $-(n - 3)/(2(n - 2))$ correspond to the two cherry leaves. The examples 
in Section 3.2 and Section 3.3 nicely illustrate this fact. 

Call two trees isomorphic if there is a bijection between edges that takes one tree to the other 
and preserves the topological structure of the tree. Call two matrices permutation-equivalent if one 
can be obtained from the other by a permutation of the rows and a permutation of the columns. 
Call two subspaces of $\mathbb{R}^m$ permutation-equivalent if one can be obtained from the other by some 
permutation of the coordinates. 

Since the split counts of a tree only depend on the topology of the tree, Theorem 4 shows that 
isomorphic trees will produce the same Shapley transformation matrix $M$ up to a permutation of 
the rows (given by permuting the order of the leaves that define the rows) and a permutation of the 
columns (given by permuting the order of the edges that define the columns). The null space of 
$M$ is not affected by permuting the rows of $M$, but permuting the columns of $M$ has the effect of 
permuting the coordinates of the null space of $M$. Therefore we summarize: 

Theorem 8. Isomorphic trees induce permutation-equivalent Shapley transformation matrices with 
permutation-equivalent null spaces. Hence, if for two trees $T_1, T_2$, their Shapley transformation matrices 
$M_1, M_2$ or their null spaces are not permutation-equivalent, then $T_1, T_2$ must not be isomorphic. 

6. Characterization of the Shapley Value of Tree Games 

The Shapley axioms presented in Section 2.4 uniquely characterize the Shapley value on the 
class of all n-person games. However, the class of n-person games that are derived from a tree is 
smaller. Thus while the Shapley axioms still hold for this smaller class, they may no longer uniquely 
determine the Shapley value as a function on this class. In this section we will therefore strengthen 
the axioms so that they once again uniquely characterize the Shapley value on the class of n-person 
games derived from a tree. 

By $\mathcal{V}^{N,E}$ we denote the class of games arising from some tree with set of leaves $N$ and edge set 
$E$. For games in $\mathcal{V}^{N,E}$ we will allow positive as well as non-positive edge weights. Thus, $\mathcal{V}^{N,E}$ is a 
linear space and we ask for its dimension. 

For a fixed pair $(N, E)$ define games $v_k$ ($k \in E$) in the following way: $v_k$ corresponds to the tree 
in which edge $k$ is weighted 1 and all other edges are weighted zero. We call such a game a basis 
game. It is readily checked that the game $v$ associated with the tree that exhibits edge weights 
$\alpha_1, \ldots, \alpha_n, \alpha_{I_1}, \ldots, \alpha_{I_{n-3}}$ is the linear combination $v = \sum_{k \in E} \alpha_k v_k$. Moreover, the family $(v_k)_{k \in E}$ 
is linearly independent. Therefore these games form a basis of $\mathcal{V}^{N,E}$ and $\dim \mathcal{V}^{N,E} = 2n - 3$. 

Note that in this context, the Shapley transformation $M$, as an $n \times (2n - 3)$ matrix, can be viewed 
as a linear transformation from $\mathcal{V}^{N,E}$ to $\mathbb{R}^n$. 

To characterize this transformation, we ask what properties (axioms) we might expect a "diversity 
measure" $\psi : \mathcal{V}^{N,E} \to \mathbb{R}^n$ to satisfy. Thus, given a tree game $v$, $\psi(v)$ is a vector in $\mathbb{R}^n$ which 
specifies for each of $n$ leaves (players) a number that measures, in some sense, their contribution to 
the diversity of a group. For instance, for the basis game $v_k$, let us consider what a "reasonable" 
distribution $\psi(v_k) \in \mathbb{R}^n$ might be. We may interpret zero edge weights on either side of the edge 
$k$ in the basis game $v_k$ as having two groups of species, each one being homogeneous. So a natural 
property would be that the degree of diversity that we assign to one group only depends on the 
size of this group (and hence the size of the other group) relative to the whole population. It seems 
plausible that a given group on one side of an edge diversifies the population more if there are more 
species on the other side of that edge. Thus we may assume that $\psi_i(v_k)$ is described by a function 
that is increasing in the fraction $f(i, k)/n$. We formulate these considerations as an additional axiom. 

Axiom (group proportionality on basis games): For fixed $N$ and $E$, a mapping $\psi : \mathcal{V}^{N,E} \to \mathbb{R}^n$ 
is said to satisfy group proportionality on basis games, if there is some constant $d \in \mathbb{R}$ such that $\psi$ 
satisfies $\sum_{j \in \mathcal{C}(i,k)} \psi_j(v_k) = d \cdot \frac{f(i,k)}{n}$ for all $i \in N$, $k \in E$. 

Thus, with $\psi$ satisfying this axiom, a group's assigned diversity changes linearly with the other 
group's fraction of the whole population. Using the new axiom, we get a characterization result for 
the Shapley value of games in $\mathcal{V}^{N,E}$, which may be regarded as a counterpart to Shapley's original 
theorem [15] characterizing the Shapley value on all games. 

Theorem 9. For each pair $(N, E)$ (consisting of leaf set $N$ and edge set $E$) there is one and 
only one mapping $\psi : \mathcal{V}^{N,E} \to \mathbb{R}^n$ that satisfies Pareto efficiency, symmetry, additivity and group 
proportionality. This mapping coincides with the Shapley value $\varphi$ restricted to $\mathcal{V}^{N,E}$, i.e., 
$\psi(v_T) = \varphi(N, v_T)$ based on the phylogenetic diversity function $v_T$. 

Proof. It is immediately verified that the Shapley value satisfies all the axioms (for group 
proportionality, use Theorem 4). 

Now, let $(N, E)$ be fixed and let $\psi$ satisfy the axioms. First, we take a basis game $v_k$ and determine 
$\psi(v_k)$. By symmetry, we may conclude $\psi_i(v_k) = \psi_j(v_k)$ as long as $i, j$ are on the same side of edge $k$. 
Hence 

$$\sum_{j \in \mathcal{C}(i,k)} \psi_j(v_k) = c(i, k)\,\psi_i(v_k). \tag{4}$$

Pareto efficiency and group proportionality imply 

$$1 = v_k(N) = \sum_{j \in \mathcal{C}(i,k)} \psi_j(v_k) + \sum_{j \in \mathcal{F}(i,k)} \psi_j(v_k) = d \cdot \frac{f(i,k)}{n} + d \cdot \frac{c(i,k)}{n} = d.$$

Hence $d = 1$ and by group proportionality and (4), we obtain $\psi_i(v_k) = \frac{f(i,k)}{n\,c(i,k)}$ for any $i \in N$ and 
$k \in E$. Analogously, we get $\psi_i(\lambda v_k) = \lambda \psi_i(v_k)$ for $\lambda \in \mathbb{R}$. Using additivity and Theorem 4, $\psi$ 
coincides with the Shapley value on $\mathcal{V}^{N,E}$. □ 
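
The first step of the proof, that the Shapley value satisfies group proportionality with $d = 1$, can be checked on a concrete tree. The sketch below assumes the five-leaf tree with internal edges AB|CDE and CD|ABE and uses the closed form $f/(nc)$ of Theorem 4 for the value of a leaf in a basis game. The helper names `near_side` and `psi` are illustrative.

```python
from fractions import Fraction

leaves = "ABCDE"
n = len(leaves)
# Five leaf edges, then the internal edges AB|CDE and CD|ABE.
splits = [set(x) for x in ("A", "B", "C", "D", "E", "AB", "CD")]

def near_side(i, side):
    """Leaves on the same side of the edge as leaf i, i.e. the set C(i, k)."""
    return side if i in side else set(leaves) - side

def psi(j, side):
    """Value of leaf j in the basis game of this edge: f / (n c)."""
    c = len(near_side(j, side))
    return Fraction(n - c, n * c)

# Left-hand side of the axiom: total value assigned to the group C(i, k).
group_totals = {
    (i, k): sum(psi(j, side) for j in near_side(i, side))
    for k, side in enumerate(splits)
    for i in leaves
}
# Right-hand side with d = 1: the far group's share f(i, k) / n.
expected = {
    (i, k): Fraction(n - len(near_side(i, side)), n)
    for k, side in enumerate(splits)
    for i in leaves
}
print(group_totals == expected)
```

For instance, on the edge $I_1$ the group {A, B} receives $3/5$ in total, which is the fraction of the population on the other side of the edge.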

We close this section with two remarks. First, note that any game arising from a tree with 
nonnegative edge weights is representable as a linear combination of basis games using nonnegative 
coefficients. Hence, we may derive a version of Theorem 9 for classes of games that actually arise 
from phylogenetic trees. 

Second, Theorem 9 provides further justification for the use of the Shapley value to analyze 
phylogenetic trees. If one wants to distribute the total diversity of a population on its species and 
the distribution rule should satisfy the above (reasonable) axioms, then the Shapley value is the only 
possible choice. As symmetry, Pareto efficiency and additivity are rather "obligatory" requirements 
for a plausible rule, it is the proportionality axiom that provides further insight into the rationale 
behind the Shapley value. Of course, modification of the group proportionality axiom eventually 
leads to a different distribution rule based on a different rationale. 

7. The Core of Tree Games 

In prior sections, we have explored the Shapley value as a solution concept for tree games.
However, another frequently studied solution concept for n-player cooperative games is the core of
a game, which is the set of all imputations $x \in \mathbb{R}^n$ such that $\sum_{i \in S} x_i \ge v(S)$ for all coalitions $S \subset N$
and $\sum_{i \in N} x_i = v(N)$. In this section we examine the core of phylogenetic tree games.

We start with three- and four-leaf tree games for intuition. 

Example 10. The characteristic function of the three-leaf tree game is given in Section 3.1 and
yields the following system of inequalities for the core:

$$x_A + x_B + x_C = \alpha + \beta + \gamma$$
$$x_A + x_B \ge \alpha + \beta$$
$$x_A + x_C \ge \alpha + \gamma$$
$$x_B + x_C \ge \beta + \gamma$$

Hence the core consists of the single element $\ell$, the vector of leaf weights $(\alpha, \beta, \gamma)^T$.

Thus the three-leaf tree has only one element in its core, namely the vector of leaf weights. Now 
we consider the four-leaf tree game, which, unlike the three-leaf tree, has an internal edge. 

Example 11. The characteristic function of the four-leaf tree game in Figure^ yields the following 
system of inequalities for the core: 

$$x_A + x_B + x_C + x_D = \alpha + \beta + \mu + \gamma + \delta$$

$$x_A + x_C \ge \alpha + \mu + \gamma \tag{5}$$

$$x_B + x_D \ge \beta + \mu + \delta \tag{6}$$



From (5) and (6), together with the first equation, we see that

$$\alpha + \mu + \gamma \le x_A + x_C \le \alpha + \gamma.$$

So either $\mu = 0$, in which case we have a degenerate tree (internal edge weight zero) and the core is
$\ell$, or the core has to be empty since the inequality cannot be satisfied.
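
The two cases of this example can be checked numerically. The sketch below is only an illustration: it assumes the internal edge separates {A, B} from {C, D} (consistent with (5) and (6)), uses made-up integer weights, and the names `four_leaf_game` and `in_core` are hypothetical. It tests whether a given vector satisfies all core constraints and computes by how much the demands of the coalitions {A, C} and {B, D} exceed the total weight.

```python
from itertools import combinations

LEAVES = ["A", "B", "C", "D"]

def four_leaf_game(alpha, beta, gamma, delta, mu):
    """Return v for a four-leaf tree whose internal edge (weight mu)
    is assumed to separate {A, B} from {C, D}."""
    splits = [(alpha, {"A"}), (beta, {"B"}), (gamma, {"C"}),
              (delta, {"D"}), (mu, {"A", "B"})]

    def v(coalition):
        members = set(coalition)
        # An edge counts when the coalition has leaves on both of its sides.
        return sum(w for w, side in splits if members & side and members - side)

    return v

def in_core(x, v):
    """Check efficiency and sum_{i in S} x_i >= v(S) for every proper coalition S."""
    if sum(x.values()) != v(LEAVES):
        return False
    return all(
        sum(x[i] for i in coalition) >= v(coalition)
        for size in range(1, len(LEAVES))
        for coalition in combinations(LEAVES, size)
    )

# Integer weights keep every comparison exact.
leaf_weights = {"A": 1, "B": 2, "C": 3, "D": 4}

degenerate = four_leaf_game(1, 2, 3, 4, mu=0)
print(in_core(leaf_weights, degenerate))   # True

generic = four_leaf_game(1, 2, 3, 4, mu=5)
shortfall = generic(["A", "C"]) + generic(["B", "D"]) - generic(LEAVES)
print(shortfall)                           # 5
```

The shortfall equals $\mu$: the disjoint coalitions {A, C} and {B, D} together require $\mu$ more than the total weight available, so no vector can satisfy both (5) and (6) when $\mu > 0$.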

These two examples illustrate the following theorem: 

Theorem 12. Let $T$ be an $n$-leaf tree, where $n \ge 3$. If the tree is degenerate (all internal
edge weights are zero), then the core of its tree game consists of the leaf weight vector $\ell$. Otherwise the core is empty.
Proof. Let $T$ be an $n$-leaf tree with edge weights $\alpha_i$ for $i \in \{1, \ldots, 2n-3\}$. Every tree has at least
two cherries, where a cherry is a set of two leaves with a common parent. Label the two leaves on
one cherry 1 and 2 and label the two leaves on the other cherry 3 and 4, each with corresponding leaf
weights $\alpha_1$, $\alpha_2$, $\alpha_3$ and $\alpha_4$. We know from the properties of the core that for the set of leaves $N$,

$$\sum_{j \in N} x_j = \sum_{i \in \{1, \ldots, 2n-3\}} \alpha_i \tag{7}$$

$$x_1 + x_3 \ge \sum_{k \in P} \alpha_k \tag{8}$$

$$\sum_{i \in N \setminus \{1,3\}} x_i \ge \sum_{k \in Q} \alpha_k \tag{9}$$

where $P$ is the set of edges in the subtree spanned by 1 and 3 and $Q$ is the set of edges in the subtree spanned by all
other leaves. Note that $Q$ must contain all edges in the tree except for the leaf edges associated with
1 and 3. Thus from (7) and (9) we get

$$x_1 + x_3 \le \alpha_1 + \alpha_3. \tag{10}$$

Then from (8) and (10) we must have

$$\sum_{k \in P} \alpha_k \le x_1 + x_3 \le \alpha_1 + \alpha_3.$$

However, this cannot be satisfied (the core is empty) unless all of the internal edge weights in $P$ are
zero. But every internal edge is in a subtree spanned by a pair of cherries, hence all internal
edge weights must be zero for the core to be nonempty. In that case, the tree is degenerate and the core
is the single element $\ell$. □
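
The key step of the proof can be illustrated numerically: adding (8) and (9) and subtracting (7) leaves exactly the internal edge weight on the path between leaves 1 and 3, which must therefore be zero for the core to be nonempty. The Python sketch below evaluates this quantity on a five-leaf tree. The tree shape, the weights, and the names `make_splits` and `cherry_gap` are assumptions chosen for illustration only.

```python
# Hypothetical five-leaf tree: cherries {1, 2} and {3, 4}, with leaf 5 attached between them.
LEAVES = [1, 2, 3, 4, 5]

def make_splits(internal_a, internal_b):
    """One (weight, leaves on one side) pair per edge; 2n - 3 = 7 edges in total."""
    return [
        (2, {1}), (3, {2}), (5, {3}), (7, {4}), (11, {5}),  # leaf edges
        (internal_a, {1, 2}),   # internal edge next to cherry {1, 2}
        (internal_b, {3, 4}),   # internal edge next to cherry {3, 4}
    ]

def diversity(splits, coalition):
    """v(S): total weight of the edges that separate two members of S."""
    members = set(coalition)
    return sum(w for w, side in splits if members & side and members - side)

def cherry_gap(splits, i, j):
    """v({i, j}) + v(N minus {i, j}) - v(N): the amount by which the
    right-hand sides of (8) and (9) together exceed that of (7)."""
    rest = [leaf for leaf in LEAVES if leaf not in (i, j)]
    return (diversity(splits, [i, j]) + diversity(splits, rest)
            - diversity(splits, LEAVES))

print(cherry_gap(make_splits(4, 6), 1, 3))   # 10, the two internal weights 4 + 6
print(cherry_gap(make_splits(0, 0), 1, 3))   # 0, the degenerate case
```

A positive gap means that (7), (8) and (9) cannot hold simultaneously, which is the emptiness argument of the proof; a zero gap is what the degenerate case produces.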

Notice that for n = 3, T is always degenerate, and thus the core will never be empty. 
Because the core of tree games is empty in most interesting cases, the Shapley value is a far more 
valuable solution concept to consider. 



8. Conclusion 

In this paper we have presented a biological interpretation of the Shapley value on games derived 
from phylogenetic trees. We have determined the linear transformation M that produces the Shapley 
value from the edge weights of the tree. We also determined its null space. It is worth noting again 
the dependence of these results on the split counts of the tree. Finally, we characterized the Shapley 
value on the space of tree games by four axioms, in much the same way as Shapley did for the space 
of all games. 

We close the paper with some speculation. One of our primary motivations for studying properties
of the Shapley value of phylogenetic tree games was the possibility of using game-theoretic
concepts to reconstruct trees from data. Our results on the properties of the Shapley transformation 
suggest several directions for further research. For instance: 

• If there were a way to estimate the Shapley value from data (such as by quantifying the 
notion of diversity of populations), this would be enough to determine edge weights of a 
degenerate tree. Do the leaf weights of this tree have any significance? 

• Is there a way to determine or estimate split counts from data, and can this assist in
determining the correct tree topology?
• Does the converse of Theorem [5] hold, i.e., if two trees have permutation-equivalent Shapley
transformation matrices or permutation-equivalent null spaces, are they isomorphic? 

• For a given n-leaf tree topology, the Shapley transformation takes a vector of leaf weights to 
a vector of Shapley values. However, one may speak of the space of all weighted n-leaf trees 
(of various tree topologies), as in pQ, and we can therefore view the Shapley transformation 
as a map (the Shapley map) from the space of trees to a vector of Shapley values. However, 
the space of trees is naturally embedded in $\mathbb{R}^{\binom{n}{2}}$, the space of pairwise distances. Is there a
"natural" extension of the Shapley map to this space? How does the kernel of the Shapley 
map extend the null spaces of Theorem [5]? Can this map be used to reconstruct trees? 

• If we use the Shapley value to rank the species in the Noah's ark problem for preservation, 
to what extent can we guarantee that the diversity of the top k species (i.e., the weight of 
the subtree spanning them) approximates the total diversity of all n species? Determine a 
bound that depends on k and n. 
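
The last question can be explored numerically on small trees before attempting a proof. The following Python sketch does not answer it; it only compares, on one toy four-leaf tree, the diversity captured by the top $k$ species ranked by Shapley value with the best diversity achievable by any $k$ species. The split encoding, the weights, the use of the split-count expression $(n - c)/(n\,c)$ for the Shapley value, and the function names are all assumptions made for illustration.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical four-leaf tree: one (weight, leaves on one side) pair per edge.
LEAVES = ["A", "B", "C", "D"]
SPLITS = [(1, {"A"}), (2, {"B"}), (3, {"C"}), (4, {"D"}), (5, {"A", "B"})]

def diversity(splits, coalition):
    """v(S): total weight of the edges that separate two members of S."""
    members = set(coalition)
    return sum(w for w, side in splits if members & side and members - side)

def shapley(splits, leaves):
    """Shapley value from split counts: weight times (n - c) / (n * c) per edge."""
    n = len(leaves)
    phi = {}
    for i in leaves:
        total = Fraction(0)
        for w, side in splits:
            c = len(side) if i in side else n - len(side)
            total += Fraction(n - c, n * c) * w
        phi[i] = total
    return phi

def top_k_ratio(splits, leaves, k):
    """Fraction of total diversity spanned by the k leaves of largest Shapley value."""
    phi = shapley(splits, leaves)
    ranked = sorted(leaves, key=lambda i: phi[i], reverse=True)
    return Fraction(diversity(splits, ranked[:k]), diversity(splits, leaves))

def best_k_ratio(splits, leaves, k):
    """Fraction of total diversity spanned by the best k-subset (brute force)."""
    best = max(diversity(splits, subset) for subset in combinations(leaves, k))
    return Fraction(best, diversity(splits, leaves))

print(top_k_ratio(SPLITS, LEAVES, 2))    # 7/15
print(best_k_ratio(SPLITS, LEAVES, 2))   # 11/15
```

In this toy instance the two top-ranked species, C and D, form a cherry, so their spanning subtree misses the internal edge, while the pair {B, D} spans more. This is one data point only, but it suggests that any bound in $k$ and $n$ has to account for closely related species ranking high together.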